Identifying Software Complexity Topics with Latent Dirichlet Allocation on Design Patterns
Similar resources
Online Inference of Topics with Latent Dirichlet Allocation
Inference algorithms for topic models are typically designed to be run over an entire collection of documents after they have been observed. However, in many applications of these models, the collection grows over time, making it infeasible to run batch algorithms repeatedly. This problem can be addressed by using online algorithms, which update estimates of the topics as each document is obser...
Complexity of Inference in Latent Dirichlet Allocation
We consider the computational complexity of probabilistic inference in Latent Dirichlet Allocation (LDA). First, we study the problem of finding the maximum a posteriori (MAP) assignment of topics to words, where the document’s topic distribution is integrated out. We show that, when the effective number of topics per document is small, exact inference takes polynomial time. In contrast, we show...
TopicXP: Exploring topics in source code using Latent Dirichlet Allocation
Acquiring a general understanding of large software systems and the components from which they are built can be a time-consuming task, but having such an understanding is an important prerequisite to adding features or fixing bugs. In this paper we propose a tool, TopicXP, to support developers during such software maintenance tasks by extracting and analyzing unstructured information in sou...
Measuring Correlation Between Linguist's Judgments and Latent Dirichlet Allocation Topics
Data that has been annotated by linguists is often considered a gold standard on many tasks in the NLP field. However, linguists are expensive so researchers seek automatic techniques that correlate well with human performance. Linguists working on the ScamSeek project were given the task of deciding how many and which document classes existed in this previously unseen corpus. This paper invest...
Experiments with Latent Dirichlet Allocation
Latent Dirichlet Allocation is a generative topic model for text. In this report, we implement collapsed Gibbs sampling to learn the topic model. We test our implementation on two data sets: classic400 and Psychological Abstract Review. We also discuss different evaluations of the models' goodness-of-fit and how parameter settings interact with it.
Journal
Journal title: Informatica Economica
Year: 2019
ISSN: 1453-1305,1842-8088
DOI: 10.12948/issn14531305/23.4.2019.01